Improving Collective I/O Performance Using Threads
نویسندگان
چکیده
Massively parallel computers are increasingly being used to solve large, I/O intensive applications in many different fields. For such applications, the I/O requirements quite often present a significant obstacle in the way of achieving good performance, and an important area of current research is the development of techniques by which these costs can be reduced. One such approach is collective I/O, where the processors cooperatively develop an I/O strategy that reduces the number, and increases the size, of I/O requests, making a much better use of the I/O subsystem. Collective I/O has been shown to significantly reduce the cost of performing I/O in many large, parallel applications, and for this reason serves as an important base upon which we can explore other mechanisms which can further reduce these costs. One promising approach is to use threads to perform the collective I/O in the background while the main thread continues with other computation in the foreground. In this paper, we explore the issues associated with implementing collective I/O in the background using threads. The most natural approach is to simply spawn off an I/O thread to perform the collective I/O in the background while the main thread continues with other computation. However, our research demonstrates that this approach is frequently the worst implementation option, often performing much more poorly than just executing collective I/O completely in the foreground. To improve the performance of thread-based collective I/O, we developed an alternate approach where part of the collective I/O operation is performed in the background, and part is performed in the foreground. We demonstrate that this new technique can significantly improve the performance of thread-based collective I/O, providing up to an 80% improvement over sequential collective I/O (where there is no attempt to overlap computation with I/O). Also, we discuss one very important application of this research which is the implementation of the split-collective parallel I/O operations defined in MPI 2.0.
منابع مشابه
Improving MPI-IO Output Performance with Active Buffering Plus Threads
Efficient collective output of intermediate results to secondary storage becomes more and more important for scientific simulations as the gap between processing power/interconnection bandwidth and the I/O system bandwidth enlarges. Dedicated servers can offload I/O from compute processors and shorten the execution time, but it is not always possible or easy for an application to use them. We p...
متن کاملMTIO - A Multi-Threaded Parallel I/O System
This paper presents the design and evaluation of a multithreaded runtime library for parallel I/O. We extend the multi-threading concept to separate the compute and I/O tasks in two separate threads of control. Multi-threading in our design permits a) asynchronous I/O even if the underlying file system does not support asynchronous I/O; b) copy avoidance from the I/O thread to the compute threa...
متن کاملOrthrus: A Framework for Implementing Efficient Collective I/O in Multi-core Clusters
Optimization of access patterns using collective I/O imposes the overhead of exchanging data between processes. In a multi-core-based cluster the costs of inter-node and intra-node data communication are vastly different, and heterogeneity in the efficiency of data exchange poses both a challenge and an opportunity for implementing efficient collective I/O. The opportunity is to effectively exp...
متن کاملCollective input/output under memory constraints
Compared with current high-performance computing (HPC) systems, exascale systems are expected to have much less memory per node, which can significantly reduce necessary collective input/output (I/O) performance. In this study, we introduce a memory-conscious collective I/O strategy that takes into account memory capacity and bandwidth constraints. The new strategy restricts aggregation data tr...
متن کاملPerformance Analysis of the Subgroup Method for Collective I/O
As many scientific applications require large data processing, the importance of parallel I/O has been increasingly recognized. Collective I/O is one of the considerable features of parallel I/O and enables application programmers to easily handle their large data volume. In this paper we measured and analyzed the performance of original collective I/O and the subgroup method, the way of using ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1999